Abstract



By paying more attention to semantics-based tool generation, programming language semantics
can significantly increase its impact. Ultimately, this may lead to "Language Design Assistants"
incorporating substantial amounts of semantic knowledge.

1 The Role of Programming Language Semantics

Programming language semantics has lost touch with large groups of potential users [?]. Among the
reasons for this unfortunate state of affairs, one stands out. Semantic results are rarely incorporated in
practical systems that would help language designers to implement and test a language under development,
or assist programmers in answering their questions about the meaning of some language feature
not properly documented in the language's reference manual. Nevertheless, such systems are potentially
more effective in bringing semantics-based formalisms and techniques to the places they are needed than
their dissemination in publications, courses, or even exemplary (but little-used) programming languages.
The current situation in which semantics, languages, and tools are drifting steadily further apart is
shown in Figure 1. The tool-oriented approach to semantics aims at making semantics definitions more
useful and productive by generating as many language-based tools from them as possible. This will,
we expect, reverse the current trend as shown in Figure 2. The goal is to produce semantically
well-founded languages and tools. Ultimately, we envision the emergence of "Language Design Assistants"
incorporating substantial amounts of semantic knowledge.

Table 1 lists the semantics definition methods we are aware of. Examples of their use can be found
in [?]. Petri nets, process algebras, and other methods that do not specifically address the semantics of
programming languages are not included. Dating back to the sixties, attribute grammars and denotational
semantics are among the oldest methods, while abstract state machines (formerly called evolving
algebras), coalgebra semantics, and program algebra are the latest additions to the field. Ironically, while
attribute grammars are popular with tool builders, semanticists do not consider them a particularly
interesting definition method. Since we will only discuss the various methods in general terms without going
into technical details, the reader need not be familiar with them. In any case, the differences between
them, while often hard to decipher because the field is highly fragmented and appropriate "dictionaries"
are lacking, do not affect our main argument.

Table 2 lists a representative language development system (if any) for the semantics definition
methods of Table 1. The last entry, Software Refinery, which has its origins in knowledge-based software
Figure 1: Semantics, languages, and tools are drifting steadily further apart.




Figure 2: In the tool-oriented approach, semantics, languages, and tools are kept together by Tool
Generation (TG) and, ultimately, Language Design Assistants (LDAs).







Semantics                       Definition in terms of

Axiomatic [?]                   Pre- and postconditions
Attribute grammars [12]         Attribute propagation rules
Denotational [?]                Lambda-expressions
Algebraic [?]                   Equations / rewrite rules
Structural operational [35]     Inference rules
Natural [?]                     Inference rules
Action [31]                     Action expressions
Abstract state machines [19]    Transition rules
Coalgebraic [21]                Behavioral specification rules
Program algebra [?]             Equations

Table 1: Current approaches to programming language semantics.
Semantics                  System                            Developer

Attribute grammars         Synthesizer Generator [?]         Cornell University
Denotational               PSG                               Technical University of Darmstadt
Algebraic                  ASF+SDF Meta-Environment [?]      CWI and University of Amsterdam
Natural                    Centaur [?]                       INRIA Sophia-Antipolis
Action                     ASD [13]                          CWI and University of Aarhus
Abstract state machines    Gem-Mex [?]                       University of L'Aquila
(none)                     Software Refinery [?]             Reasoning Systems, Palo Alto

Table 2: Some representative language development systems.



environments research at Kestrel Institute, does not fit any of the current semantics paradigms. The
pioneering Semanol system [?] is, to the best of our knowledge, no longer in use and is not included. The
systems listed have widely different capabilities and are in widely different stages of development. Before
discussing their characteristics and applications in Section 3, we first explain the general ideas underlying
the tool-oriented approach to programming language semantics. These were shaped by our experiences
with the ASF+SDF Meta-Environment (Table 2) over the past ten years. Finally, we discuss Language
Design Assistants in Section 4.

2 A Tool-Oriented Approach to Semantics 

The tool-oriented approach to semantics aims at making semantics definitions more useful and productive 
by generating as many language-based tools from them as possible. This affects many aspects of the way 
programming language semantics is practiced and upsets some of its dogmas. 

Table 3 lists some of the tools that might be generated. In principle, the language definition has to
be augmented with suitable tool-specific information for each tool to be generated, and this may require 
tool-specific language extensions to the core semantics definition formalism. In practice, this is not always 
necessary since semantics definitions tend to contain a good deal of implicit information that may be 
extracted and used for tool generation. 



Scanner/Parser
Prettyprinter 
Syntax-directed editor 

Typechecker 
(Abstract) interpreter(s) 
Dataflow analyzer 
Call graph extractor 
Partial evaluator 
Optimizer 
Program slicer 
Origin tracker 

Debugger 
Code generator 
Compiler 
Profiler 
Test case generator 
Test coverage analyzer 
Regression test tool 
Complexity analyzer (metrics) 
Documentation generator 
Cluster analysis tool 
Systematic program modification tool 

Table 3: Tools that might be derived from a language definition. 



The first entry of Table 3, scanner and parser generation, is standard technology. Lex and Yacc are
well-known examples of stand-alone generators for this purpose. Their input formalisms are close to
regular expressions and BNF, the de facto standard formalisms for regular and context-free grammars,
respectively. Unfortunately, for most of the other tools in Table 3 there are no such standard formalisms.
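
To illustrate what generating a tool from a declarative definition means in the simplest case, the
following Python sketch derives a scanner from a list of (token name, regular expression) pairs. It is
only an illustration of the idea behind generators such as Lex, not their actual input format or
algorithm; the names make_scanner, TOKENS, and the token classes are assumptions made for this example.

```python
import re

def make_scanner(token_spec, skip=("WS",)):
    """Build a scanner function from (name, regex) pairs (hypothetical helper)."""
    # Combine all token definitions into one alternation of named groups.
    master = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in token_spec))

    def scan(text):
        pos, tokens = 0, []
        while pos < len(text):
            m = master.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
            if m.lastgroup not in skip:      # drop whitespace tokens
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        return tokens

    return scan

# The "language definition": illustrative token classes, not taken from the paper.
TOKENS = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("WS", r"\s+"),
]

scan = make_scanner(TOKENS)
print(scan("x = 42 + y"))
```

The token list is the only language-specific part; the scanner itself is produced mechanically from it,
which is the property the tool-oriented approach would like to obtain for the other tools of Table 3.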

The key features of the tool-oriented approach are: 

• Language definitions are primarily tool generator input. They do not have to provide any kind of 
theoretical "explanation" of the constructs of the language in question nor do they have to become 
part of a language reference manual. 

• An interpreter that can act, among other things, as an "oracle" to programmers needing help will 
be among the first tools to be generated. 

• Writing (large) language definitions loses its esoteric character and becomes similar to any other 
kind of programming. Semantics formalisms tend to do best on small examples, but lose much of 
their power as the language definitions being written grow. In the tool-oriented approach, semantics 
formalisms have to be modular and separate generation (the analogue of separate compilation) has 
to be supported. Libraries of language constructs become important. 

• The tool-oriented approach may require addition of tool-specific features to the core formalism. 
This leads to an open-ended rather than a "pure" style of semantics description. 

• The scope of the tool-oriented approach includes, for instance, 
System                     Generated tools                                            Semantic engine

Synthesizer Generator      scanner/parser (LALR), prettyprinter, syntax-directed      incremental attribute
                           editor, incremental typechecker, incremental               evaluator
                           translator, ...

PSG                        scanner/parser, syntax-directed editor, incremental        functional language
                           typechecker (even for incomplete program fragments),       interpreter
                           interpreter

ASF+SDF Meta-Environment   scanner/parser (generalized LR), prettyprinter,            conditional rewrite
                           syntax-directed editor, typechecker, interpreter,          rule engine
                           origin tracker, translator, renovation tools, ...

Centaur                    scanner/parser (LALR), prettyprinter, syntax-directed      inference rule engine
                           editor, typechecker, interpreter, origin tracker,
                           translator, ...

ASD                        scanner/parser, syntax-directed editor, checker,           conditional rewrite
                           interpreter                                                rule engine

Gem-Mex                    scanner/parser, typechecker, interpreter, debugger         transition rule engine

Software Refinery          scanner/parser (LALR), prettyprinter, syntax-directed      tree manipulation
                           editor, object-oriented parse tree repository              engine
                           (including dataflow relations), Y2K/Euro tools,
                           program slicer, ...

Table 4: Tool generation capabilities of representative language development systems.



  - Domain-specific and little languages [?]. Many of the tools in Table 3 are as useful for
    DSLs as they are for programming languages.

  - Software maintenance and renovation tools. Some of these are included at the end of
    Table 3.

  - Compiler toolkits such as CoSy [?], Cocktail [?], OCS [?], SUIF [?], and PIM [?].



3 Existing Language Development Systems 

Table 4 summarizes the tool generation capabilities of the representative language development systems
listed in Table 2. All of them can generate lexical scanners, parsers, and prettyprinters, many of them
can produce syntax-directed editors, typecheckers, and interpreters, and a few can produce various kinds 
of software renovation tools. To this end, they support one or more specification formalisms, but these 
differ in generality and application domain. 

For instance, the Synthesizer Generator supports attribute grammars with incremental attribute
evaluation, which is particularly suitable for typechecking, static analysis and translation, but less suitable
for dynamic semantics. The ASF+SDF Meta-Environment supports conditional rewrite rules rather than
attribute grammars, and these can be used for defining dynamic semantics as well. Software Refinery
comes with a full-blown functional language in which a wide range of computations on programs can
be expressed. Other systems provide more specialized specification formalisms. PSG, for instance, uses
context relations to describe incremental typechecking (even for incomplete program fragments) and
denotational definitions for dynamic semantics. Gem-Mex supports a semi-visual formalism optimized
for the definition of programming language semantics and tool generation. It can generate a typechecker,
an interpreter, and a debugger.
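
As a rough illustration of how rewrite rules with conditions can define dynamic semantics, the
following Python sketch evaluates a tiny expression language by repeatedly applying rules to terms.
This is not ASF+SDF notation and not how any of the systems above is implemented; the term encoding
(nested tuples), the rule names, and the evaluation strategy are all assumptions made for this example.

```python
# Terms are nested tuples, e.g. ("add", ("num", 2), ("num", 3)).
# Each rule returns the rewritten term, or None if it does not apply.

def rule_add(t):
    # add(num(a), num(b)) -> num(a + b)
    if t[0] == "add" and t[1][0] == "num" and t[2][0] == "num":
        return ("num", t[1][1] + t[2][1])
    return None

def rule_less(t):
    # less(num(a), num(b)) -> true   when a < b   (the condition)
    # less(num(a), num(b)) -> false  otherwise
    if t[0] == "less" and t[1][0] == "num" and t[2][0] == "num":
        return ("true",) if t[1][1] < t[2][1] else ("false",)
    return None

def rule_if(t):
    # if(true, x, y) -> x ; if(false, x, y) -> y
    if t[0] == "if" and t[1] == ("true",):
        return t[2]
    if t[0] == "if" and t[1] == ("false",):
        return t[3]
    return None

RULES = [rule_add, rule_less, rule_if]

def normalize(t):
    """Rewrite a term until no rule applies (the 'interpreter')."""
    if t[0] == "num":
        return t
    if t[0] == "if":
        # Only the controlling expression is reduced before selecting a branch.
        t = ("if", normalize(t[1]), t[2], t[3])
    else:
        t = (t[0],) + tuple(normalize(c) for c in t[1:])
    for rule in RULES:
        result = rule(t)
        if result is not None:
            return normalize(result)
    return t

program = ("if",
           ("less", ("num", 1), ("add", ("num", 2), ("num", 3))),
           ("add", ("num", 10), ("num", 1)),
           ("num", 0))
print(normalize(program))
```

The same set of rules serves both as a definition of the meaning of each construct and as the input of
a generic engine that executes programs, which is what makes rule-based formalisms attractive for
generating interpreters.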

Table 4 is far from complete. Some other language development systems are SIS [30], PSP [32], GAG
[?], SPS [?], MESS [?], Actress [?], Pregmatic [?], LDL [?], and Eli [?]. Many of the tools listed
in Table 3 are not generated by any current system. Ample opportunities for tool generation still exist
in areas like optimization, dynamic program analysis, testing, and maintenance.



4 Toward Language Design Assistants 

The logical next step beyond semantics-based tool generation would lead to a situation similar to that
of computer algebra. Large parts of mathematics are being incorporated in computer algebra systems.
Conversely, computer algebra itself has become a fruitful mathematical activity, yielding new results
of general mathematical interest. In the case of semantics, we see opportunities for "Language Design
Assistants" incorporating a substantial amount of both formal and informal semantic knowledge. The
latter is found, for instance, in language design rationales and discussion documents produced by
standardization bodies. Development of such assistants will not only push semantics even further toward
practical application, but also give rise to new theoretical questions.

The Language Design Assistants we have in mind would support the human language designer by 
providing design choices and performing consistency checks during the design process. Operational 
knowledge about typical issues like typing rules, scope rules, and execution models should be incorporated 
in them. Major research questions arise here regarding the acquisition, representation, organization, and 
abstraction level of the required knowledge. For instance, should it be organized according to any of 
the currently known paradigms of object-oriented, functional, or logic programming? Or should a higher 
level of abstraction be found from which these and other, new, paradigms can be derived? How can 
constraints on the composition of certain features be expressed and checked? Another key question is 
how to construct a collection of "language feature components" that are sufficiently general to be reusable 
across a wide range of languages. 

Similar considerations apply to tool development. By incorporating knowledge about tool generation 
in the Language Design Assistant we can envision a Tool Generation Assistant that helps in constructing 
tools in a more advanced way than the tool generation we had in mind in the previous sections. 

To make this perspective somewhat more tangible, consider the relatively simple case of an
if-then-else-like conditional construct that has to be modelled as a language feature component. Table 5 gives
an impression of the wide range of issues that has to be addressed before such a generic conditional
construct can be specialized into a concrete if-then-else-statement or conditional expression in a specific
language. It is a research question to design an abstract framework in which these and similar questions
can be expressed and answered.

Another major question is how to organize the specialization process from language feature
component to concrete language construct. The main alternatives are parameterization and transformation
[?]. Using parameterization, specialization of the component in question amounts to instantiating its
parameters. Since parameters have to be identified beforehand and instantiation is usually a rather simple
mechanism, the adaptability/reusability of a parameterized component is limited. Using transformations,
on the other hand, a language feature component is designed without explicit parameters. Specialization
is achieved by applying appropriate transformation rules to it to obtain the desired specific case. Clearly,
this approach is more flexible since any part of the language feature component can be modified by the
transformation rules and can thus effectively act as a parameter. The relation between this approach of
meta-level transformation and parameterized modules is largely unexplored.



○ What is the type of the expression controlling the selection of one of the two branches?

○ How is the controlling expression evaluated (short circuit vs. full evaluation)?

○ Is the controlling expression evaluated concurrently with other program parts (with
  speculative execution of the conditional as a special case)?

○ Can the controlling expression have side-effects?

○ Can the controlling expression cause exceptions?

○ Are jumps from outside into the branches allowed?

○ Is the selected branch evaluated concurrently with other program parts?

○ Can the evaluation of the selected branch cause side-effects?

○ Can the evaluation of the selected branch cause exceptions?

○ Does the evaluation of the conditional construct yield a value?

Table 5: Some of the possible parameters of a generic conditional construct. 
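
The difference between the two specialization mechanisms can be made concrete with a small Python
sketch. It is purely illustrative: the class ConditionalSpec, its fields (loosely modelled on a few rows
of Table 5), the tree encoding of the generic component, and the two transformation rules are all
assumptions made for this example, not part of any existing system.

```python
from dataclasses import dataclass

# --- Parameterization: parameters are fixed in advance and then instantiated. ---

@dataclass(frozen=True)
class ConditionalSpec:
    condition_type: str = "bool"
    yields_value: bool = False
    condition_may_have_side_effects: bool = True
    jumps_into_branches_allowed: bool = False

c_like_if = ConditionalSpec(condition_type="int")
conditional_expression = ConditionalSpec(yields_value=True,
                                         condition_may_have_side_effects=False)

# --- Transformation: the component is a plain tree; rules may change any part. ---

GENERIC = ("conditional",
           ("cond", ("type", "any")),
           ("then", "branch"),
           ("else", "branch"))

def transform(tree, rule):
    """Apply a rule bottom-up to every node; a rule returns None when it does not apply."""
    if isinstance(tree, tuple):
        tree = tuple(transform(child, rule) for child in tree)
    result = rule(tree)
    return tree if result is None else result

def require_bool(node):
    # Restrict the controlling expression to type bool.
    if node == ("type", "any"):
        return ("type", "bool")
    return None

def drop_else(node):
    # Turn if-then-else into if-then: a change no predeclared parameter anticipated.
    if isinstance(node, tuple) and node and node[0] == "conditional":
        return tuple(c for c in node if not (isinstance(c, tuple) and c[0] == "else"))
    return None

if_then = transform(transform(GENERIC, require_bool), drop_else)
print(c_like_if)
print(if_then)
```

In the parameterized version, removing the else-branch is impossible unless a parameter for it was
foreseen; in the transformational version, any part of the tree can be rewritten, at the price of
fewer guarantees about what the result still means.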

Although we are not aware of research on Language Design Assistants from the broad perspective 
sketched here, there is some work pointing in the same general direction: 



• The Language Designer's Workbench sketched as future work in [?, 25] has some of the same goals.

• Action semantics also emphasizes libraries of reusable language constructs.

• Plans (no longer pursued) for the Language Development Laboratory [?] included a library of
reusable language constructs, a knowledge base containing knowledge of languages and their
compilers/interpreters, and a tool for language design.

• The "design and implementation by selection" of languages described in [33, 25] is a case study in
high-level interactive composition of predefined language constructs.